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I. INTRODUCTION 

A living organism is a complex interconnection of many control units that form a gene regulatory 
network. In developmental biology, clusters of DNA sequence elements (cis-regulatory module) are 
target sites for transcription factors. One cis-regulatory module controls a set of gene expression 
both in space (location) and time 0|. One transcription factor can interact with many modules, 
and one module is controlled by many transcription factors. Thus, the time and space variation of 
a gene expression is a consequence of an interconnected network of interactions. 

With the advent of high throughput technologies (microarrays, proteomics tools) the need for 
quantitative models of gene networks becomes a reality 0, 0, IH| • In recent years we have witnessed 
a growing interest in experiments within the field of systems biology that require mathematical 
models to describe the experimental results |fil. ItI IsL 1^. Hoi. Hll. fl"3] . A mathematical model for gene 
regulatory networks is also closely related with synthetic biology, the engineering counterpart of 
systems biology Ijl 3|- Similar with the development of the field of electronics, where complex 



equipment is built on interconnected simple devices, the field of synthetic bi olog y aims to build sim- 
ple molecular devices for later use in more complex molecular machines 12 1- To build a robust, 



reliable, and simple device, the molecular engineer needs to have a mathematical description of the 
system in order to evaluate the number, range and meaning of a group of parameters that are critical 
for the device functionality. As with any mathematical model of a natural system, the models of a 
gene regulatory network must fulfill certain constraints. At present, the community of researchers 
agrees that a gene regulation model must be nonlinear and stochastic. Nonlinearity is a widespread 
phenomena in life science 3> 3- When two molecules dimerize to form a complex, the concentra- 



tion of the complex is proportional to the product of the concentrations of the molecules involved 
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in the process. Multimerization will require polynomial functions in molecular concentrations 2C|; 
furthermore, rational functions are used to model the most simple gene autoregulatory system 21]. 
The model should be also stochastic, its molecules being subjected to the thermodynamics laws 
of fluctuation. Two cell lines with identical genetic background can show variation in phenotype 
due only to a probabilistic distribution of the molecule number inside the cell j^]. The stochastic 
process that describe molecular interactions are fundamentally discrete; the molecule number can 
change by an integer amount only. Fluctuations in biological systems created by these stochastic 
processes, are described by a Master Equation [2(3]. Approximations to the Master Equation, like 
the Fokker-Planck, Langevin and O expansion, are often use 2^, 24], 2^, 2(1 . However, many biolog- 
ical regulatory systems function with molecules present in low numbers [23]. For such systems, the 
Master Equation should not be approximated 2^, 2^, 3(3, 31]. Another requirement for the model 



is to explain the flow of a signal as it passes through the genetic network. To reveal the structure 
of a gene regulatory interactions, input signals (growth factors, heat shocks, drugs) are inserted 
into different positions within a gene regulatory network. Then, output signals are measured at 
some other positions. A coherent model of the network must explain the measured input-output 
relations and must provide predictions that can be experimentally tested. It is desirable to know 
how measured data is related with the input signal 0, liol . I3H ] . In engineering sciences |33| , control 
theory explains in mathematical terms the relation between the input and output signals (measured 
data). The same theory provides ways to design stable systems using feedback signals. Ideas from 
Control Theory translated into molecular biology will help the design process in synthetic biology. 

In jiO] it is proposed that signal generators controlled by light, [3], should be incorporated into 
the gene regulatory network. With the help of these light-controlled signal generators, different types 
of signal perturbations can be imposed on the gene regulatory network. In [i(| the Master Equation 
was solved for systems that are linear in the transition probabilities. The network's response to 
signal generators was expressed in terms of a transfer matrix for the first and second order moments 
of the stochastic process. Our goal in this article is to construct a mathematical description of a 
gene regulatory network that is nonlinear, stochastic and explains the input-output relations as in 
control theory. The nonlinearity means that the transition probabilities are rational functions in 
the molecule numbers. We do not use Fokker-Planck, Langevin or f2 expansion to approximate 
the Master Equation. For polynomial transition probabilities, the time evolution equation for the 
factorial cumulants will have the following form: 



X aia2 ... an - { Ra 1 a 2 ...a k { t )'^2 X a k+1 \...\a n \Y[m^] } ( L1 ) 

If the transition probabilities are rational functions, the equations are (jIII.54[) . The signal generators 
are part of the coefficients R™ ia2 ... a . (i)- Before the theory is presented, we will solve four genetic 
regulatory networks to explain all the notations and meaning of the variables in p.ljl and pil.54[) . 
The examples are also meant to provide practical applications of the underlying theory. 



II. RESULTS 
A. Two Genes Coupled by a Nonlinear Interaction 

The aim of this subsection is twofold: to introduce the notations that will later be generalized, 
and to explain how the classical control theory model changes due to stochastic effects. Each genetic 
regulatory network is built on a set of interacting molecular species. In the present case the entire 
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protein 1 protein 2 




System 1 System 2 

FIG. 1: Two interconnected systems. 



network is composed of System 1, with mRNAl and proteinl as molecular species, coupled with 
System 2 with mRNA2 and protein2 as its molecular species. The number of mRNAl is denoted by 
ri and similar notations for the other three components. At a given time, the state of the network 
will be a denoted by 

q = (51,92,93,94) = (rupi,r 2 ,P2)- (II. 1) 

The mRNAl is controlled by an input generator and thus the state q will evolve in time. Such a 
generator can be practically constructed using an yeast two- hybrid system, as is described in jsij. 
The light-switch is based on phytochrome that is synthesized in darkness in the Ql form, Fig. 2. 
A red light photon of wavelength 664 nm shined on the Ql form of the protein transforms it in the 
form Q2. Fig. 2 presents the state of the switch after the effect of the corresponding wavelength 
took place. When Q2 absorbs a far red light of wavelength 748 nm, the molecule Q goes back to its 
original form, Ql. These transitions take milliseconds. The protein P interacts only with the Q2 
form, recruiting thus the activation domain to the target promoter. In this position, the promoter 
is open and the gene is transcribed. After the desired time elapsed, the gene can be turned off by a 
photon from a far red light source. Using a sequence of red and far red light pulses the molecular 
switch can be opened and closed. 

The time evolution of the state depends on all possible transitions that can appear in the system. 
For example, from the state (ri,pi,r2,P2) a t time t, the system can move to the state with one more 
proteinl molecule (r\,p\ + l,r2,P2) because mRNAl is translated. This transition is described by 
a vector €2 = (0, 1,0,0) that shows the change in the state: (r±,pi + l,r2,p 2 ) = (^1,^1,^2,^2) + ^2- 
The list of all possible transitions is described in the first column of Table 1. Which transition will 
actually take place is governed by a stochastic process j^. A third element in the model (beside the 
state and transitions) is the set of all transition probabilities T e . If the state at time t is q, then the 
probability of the system to jump in the state q + e at the time t + dt is T e (q, t)dt. The presence of 
the time t in the argument of the transition function show that, in general, the transition depends 
not only on the number of molecules in the system, but also on the moment in time when it is 
recorded. T tl in Table 1 is a time dependent transition probability. It contains the signal generator 
G(t) that modulates the mRNAl transcription. 

All transition probabilities are given in the second column in Table 1. The coefficients that 
multiply the molecule number in a transition probability are the parameters of the model. Higher 
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LEGEND: AD = ACTIVATION DOMAIN 
BD = BINDING DOMAIN 

Q = A PROTEIN THAT CHANGES ITS FORM UPON LIGHT EXPOSURE, 

FROM Ql TO Q2 AND BACK 
P= A PROTEIN THAT INTERACTS ONLY WITH FORM 2 OF PROTEIN Q, (Q2) 



FIG. 2: Signal generator. Adapted from ref. 



values of the coefficient will make that transition to appear more often per unit of time. The last 
column of the table will be explained later in the text. The the dynamic of the genetic network is 
given by the time evolution of the probability of the network to be in the state q at time t: PUht). 
The equation for the time evolution of the state probability is know as the Master Equation [20( . 



TABLE 1: Nonlinear Coupling of Two Linear Systems 





Transitions 


Transition 
probabilities 


Coefficients 


Polynomial basis 


Signal Generator 


ei = (1,0,0,0) 


T t M=G(t) 


M E1 (t) = G(t) 


e (0, 0,0,0) = 1 




c_i = (-1,0,0,0) 


T e _ 1 (q) = 7n»"i 


ML, = 7pi 


e (l, 0,0,0) ~ r l 


Linear System 1 


£ 2 = (0,1,0,0) 


Te 2 (q) = km 


M} 2 = ki 


e (l, 0,0,0) = r l 




e_ 2 = (0,-1,0,0) 


T c- 2 {l) = IpiPl 


M?_ 2 = 7 P1 


e (0, 1,0,0) = Pi 


Nonlinear Coupling 


e 3 = (0,0,1,0) 


T e3 [q)=hpi = 
= ftpi(pi - 1) + hpx 


Ml = h 
Mf 3 = h 


e (0, 1,0,0) = Pi 

e (o,2,o,o) = Pi (Pi - !) 




e_ 3 = (0,0,-1,0) 


T e- 3 {<l) =7r 2 r2 


Mf_ 3 = lr 2 


e (0, 0,1,0) = r 2 


Linear System 2 


6 4 = (0,0, 0,1) 




Ml = k 2 


e (0, 0,1,0) = r 2 




e_ 4 = (0,0,0,-1) 


Te-M) = 7p 2 P2 


ML* = 7 P2 


e (0,0,0,l) = P2 
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P(n, Pl ,r 2 ,P2,t + dt)= {11.2) 
P(ri - l,pi,r 2 ,p 2 ,t) T ei (n - l,pi,r 2 ,p 2 ,t)dt + P(n + l,Pi,r 2 ,P2,t) IL^n + l,p 1 ,r 2 ,p 2 ,t)dt + 
P(n,Pi ~ l,r 2 ,P2,t) T e2 (n,pi - l,r 2 ,p 2 ,t)dt + P(ri,pi + l,r 2 ,p 2 ,t) T e _ 2 (n,pi + l,r 2 ,p 2 ,t)dt + 
P(ri,pi,r 2 - l,p 2 ,t) T e3 (ri,pi,r 2 - l,p 2 ,t)dt + P(n,pi,r 2 + l,p 2 ,t) T t _ a (ri,pi,r 2 + l,p 2 ,t)dt + 
P(ri,Pi,r 2 ,p 2 - l,t) T e4 (n,pi,r 2 ,p 2 - l,t)dt + P(n,p 1 ,r 2 ,p 2 + l,t) T t _ 4 (r 1 ,p 1 ,r 2 ,p 2 + l,t)dt + 

P(ri,Pi,r 2 ,P2,*) U - T ei (r 1 ,p 1 ,r 2 ,p 2 ,t)dt - T e _ 1 (r 1 ,p 1 ,r 2 ,p 2 ,t)dt - T e2 (n,pi,r 2 ,p 2 ,t)dt - 

T e _ 2 (r 1 ,p 1 ,r 2 ,p 2 ,t)dt - T e3 (r 1 ,p 1 ,r 2 ,p 2 ,t)dt - T e _ 3 (r 1 ,p 1 ,r 2 ,p 2 ,t)dt- 

T t4 (r 1 ,p 1 ,r 2 ,p 2 ,t)dt - T e „ 4 (r 1 ,p 1 ,r 2 ,p 2 ,t)dt) . 

Given that the system was at time t in the state q = (ri,pi,r 2 ,p 2 ), the first 8 terms in ()II.2|) 
represent the probability that the system will be in a new state at the time t + dt. The new 
state depends on which transition actually took place. The rest of the terms in (|II.2|) express the 
probability that no transition will take place in (t,t + dt). Dividing by dt and taking the limit 
dt — > 0, the above relation between probabilities takes the form of a partial differential equation: 



-P (q, t)=J2 Te(q ~ e, t)P(q - e, t) - £ T € (q, t)P(q, t) . (11.3) 

e e 

This equation is known as the Master Equation for the jump Markov processes with discrete 
states [2(1 The summation is over all possible transitions. It is hard to solve this equation, 
even in very simple examples. However, we are interested in the mean number of molecules of 
different species and in their standard deviation. Or more generally, we want to know the correlation 
between different molecular species. The quantities of interest are thus means of products of state 
components: 



(q m ) = J> m PM), (II.4) 

Q 

where the sum goes over all possible states. The notation m stands for a vector m = (mi,m 2 , m^^m^) 
of integer numbers. The number of components of rn is the same as the number of components in 
the state q. The power of q to the m is defined as q m = q™ 1 q™ 2 q™ 3 q™ 4 . We need a line on top of 
m to distinguish it from a simple m that will be used heavily in what follows. 

The time evolution for (q m ) can be obtained from the Master Equation (|II.3|) using the z- 
transform of the state probability P(q,t): 



F(z, t) = Z(P(q, t)) = z q P(Q, t) (II.5) 

where z = (z\,z 2 ,Z3, Z4) are variables in the complex plane and the power z q was defined above. 
Quantities of interest are means of polynomials in the components of the state variable q, like (q m ). 
A natural way to obtain means of polynomials in q is by taking derivatives with respect to z in 



G 



1)11. 5|) and then put %i = 1, i = 1 . . . 4. For example: 

gFfot) , m ^ 

(n> = -^U. (°-6) 

(P2(P2 - 1)) = U=i • (IL7) 

We notice that the derivatives of F(z, t) bring us to a set of polynomials that are known as 
decreasing factorials: 



ZmMk) = ?fc(?jfc - l)-(?Jt - m/fc + 1) , (II.8) 
em(?) = e mi (gi)e m2 (<?2)e m3 (g3)e m4 (<?4) . (II.9) 

We will use the polynomials emiq) as a base, to express all the transition probabilities, as is 
explained in the last column of Table 1. In this base, the results are easy to express in terms of the 
derivatives of F(z,t). A decreasing factorial has a physical interpretation, |20J. In a system with 
qk molecules of specie k, the probability for a collision involving such molecules is proportional 
with e mk (qk)- In other words, the probability for multimer formation is described by a decreasing 
factorial. 

Every polynomial can be expressed as a linear combination of the basic polynomials &m(q)- To 
make a distinction between the moments {q m ) and (&m;(q)), the later one is known as a factorial 
moment. The first order moments (which are actually the means of the state variable) and the first 
order factorial moments are equal. 

The variables that describe the system are thus 

(e^(g)) =d m F(z,t) \ z=1 , (11.10) 
which also displays the tensor index m. From a vector index 

m = (mi, m2, UI3, 7714) , (H-11) 

we can construct a tensor index 

m = 11. -.1 22. ..2 33. ..3 44.. .4 , (11.12) 

mi rri2 rriA 

and vice versa. The tensor index m is useful for ordering the variables as they come from partial 
derivatives. 

d m = dZ\ . . . Zt, Z 2 . . . Z 2 , Z 3 . . . Z 3 , Zi . . . Z4 . 

mi mj m-i rrn 

For example 

{rir 2 P2(P2 ~ 1)(P2 - 2)} = 9i344 4 F \ g=1 , 

because r\ is on the first position in the state q = (ri,pi,r2,P2), and only one derivative with 
respect to z\ is required. The protein p 2 is on the 4th position and it takes 3 derivatives to obtain 
P2(P2 - 1)(P2 - 2). 



7 



Instead of always showing that after a partial derivative we have to insert z = 1 in the expression, 
we will use the following notation 



d m F \ z=1 = F m . (11.13) 

For special examples like the one we work with, we can use suggestive notations for the indices 
z = (z ri , z Pl , z T2 , z P2 ). We list the first F m variables in order as they appear in the Taylor expansion 
of the function F(z,t) about z = 1. 



Fri j F r2 , F Pl , Fp 2 , F riri , F riPl , F rir2 , F ri p 2 , . . . , F P2P2 , F ririri , F ririr2 ... . 

These variables change in time as the generator G(t) drives the system. From these variables we 
can read the mean values of the molecules like (r 2 ) = F r2 and also their standard deviation from the 
mean, ((r2 — (?"2)) 2 ) = F r2T2 + F r2 — F^ 2 - The time evolution of the variables F m is a consequence 
of the Master Equation in F(z,t): 



d t F(z,t) = G(t)(z ri - l)F(z,t) + (11.14) 
7 n (l - z ri )d Zri F(z,t) + kx(z pi - l)z ri d Zri F(z,t) +j pi (l ~ z Pl )d Zpi F(z,t) + 
h(z r2 - 1) (z 2 pi d ZpiZpi F(z,t) + z Pl d Pl F(z,t)) + 

7r 2 (l - z r2 )d Zr2 F(z,t) + A; 2 (zp 2 - l)^ 2 ^ r2 F(z, t) +7 P2 ( 1 - Zp 2 )d Zn F{z,t) . 

The right side of this equation is composed of three pieces. The first piece contains variables 
only from the System 1 (first 4 terms). Then comes a term proportional with the coupling constant 
h that show how the two systems are connected. The last three terms are specific to System 2. 
The coupling terms contain the second derivative with respect the protein 1, which is the input 
signal into the second system. Derivatives of order more than 1 are a sign of nonlinearity. Here the 
coupling transition probability T €3 is a quadratic function in the protein number of System 1. 

The time evolution equations of the F m variables can be obtained from ()II.14|) by taking partial 
derivatives with respect to different combinations of the components of z and then inserting z = 1. 
Through this procedure, the partial differential equation for F(z, t) transforms into an infinite 
system of ordinary differential equations for the F m variables (the tensor index m takes all the 
possible values). We present the first few: 



F ri 


= G{t)- lri F ri 




r pi 


— k\F Tl 7pi^pi 




L r2 


= h(Fp 1 + F pipi ) — 7 r2 


TP 

r T2 




= k2F r2 — r )p 2 Fp 2 




TP 

r ri P 2 


= G(t)F P2 + k2F riT2 — 


(7ri +7p2^ 


TP 

l r 2 r 2 


= 2h(F pir2 + F pi p ir2 ) 


— 2j , y r2 F r2r2 



If the system is linear, i.e. all the transition probabilities are linear in the state variables, then 
the infinite system can be closed to a finite one. Namely, if we collect all variables up to the modulus 
|m| = Max, (|m| = mi + • • • + m n ), than we create a system of equations that do not depend on a 




FIG. 3: Graph of the equations for the coupled systems. 
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fn that has a modulus greater than Max. The system, being finite, is completely solvable |3(|. 
For the nonlinear transition probabilities, the system is usually not finite. The variable F m will 
depend on some other F t with \m'\ > \m\. 

Thus we need to cut the infinite system to obtain an ordinary system of equations: discard all 
F m s with |m| greater then sum cutoff value. The problem that we run into is that the solution 
for variables F m with small \m\ depends on the cut even if the cut is taken at high values of \m\. 
Moreover, from a stable infinite system, by cutting we obtain an unstable system. The cause of this 
behavior is that as |m| increases, the values of F m increases because it represents a mean of a high 
power polynomial in the state variables. Discarding such high values from the equations causes the 
aforementioned instability. However, a finite system can be obtained if instead of the variables F m 
we change to a new set of variables that have small values that can be neglected for higher values of 
\m\. These variables represent for the factorial moments what cumulants represent for the classical 
moments |3(|. The new function, call it X(z,t) is given by 

F(z,t) = e x ^ l) . (11.15) 

The equation for the time evolution of X(z,t) is 

d t X(z,t) = G(t)(z ri -1) + 

7n(l - z n )d Zri X(z,t) + k 1 (z pl - l)z n d Zri X(z,t) +j Pl (l - z Pl )d Zpi X(z,t) + 

h{z r2 - 1) (z 2 pi d ZpiZpi X(z,t) + z 2 pi (d Zpi X(z,t)) 2 + Zpi d Zpi X(z,t)) + 

7 r2 (l - z r2 )d Zr2 X(z,t) + k 2 {z P2 - l)z r2 d Zr2 X(z,t) + -y P2 (l - z P2 )d Zp2 X(z,t) . 



We notice that the term proportional with h couples the two linear systems. Taking partial deriva- 
tives with respect to z and inserting z = 1, we obtain the variables X m , indexed by the tensorial 
index m. The system of equations for these variables is represented as a graph in Fig. 3. Each 
node represents the time derivative of the variable written inside the node. A line entering the node 
corresponds to one term on the right side of the equation for that node; the term is the product 
of the variable written inside the start node and the coefficient above the line. For example, the 
equation for the variable X T2P2 is 

— k,2X r2 + f£2X r2r2 (^/r2 Tp2 ) "^f*2P2 h(.XpiP2 -^PiPlP2 ^-^Pl -^P1P2 ) • (11.16) 

The indices T2P2 of X T2V2 belong to System 2 as well as three other terms that have indices from 
that same system. The term k,2X T2 is represented by the line starting on X r2 and ending on the 
X T2P2 node. The coupling coefficient h multiplies the polynomial combination X pip2 + X pipiP2 + 
2X pi X piP2 . These polynomial combinations stem from the nonlinear coupling and their role is to 
connect System 1 with the System 2. If we are interested only in the mean value of the protein 
P2, that is X P2 , then we only need to solve the equations for the System 1, compute the coupling 
factor X Pl + X pipi + X P1 and solve the equations for System 2. However, if we ask for the standard 
deviation of the number of protein molecules p2, then we need X P2P2 which requires solving for the 
coupling variables X rir2 , X riP2 , . . . X PlPlP2 , see Fig. 3. When it comes to solving for the coupling 
variables, we find that variables up to fourth order in System 1 are necessary (like X riririri ). In 
general, suppose we need to solve System 2 in order n, that is we need X m with \m\ = n and all 
components of the index m contain r2 or p2- Then, we need an order n + 1 in the coupling variables 
and n + 2 in System 1. Another observation is that even if the coupling is nonlinear, the entire 
system of equations is finite. This property is a reminiscent of the fact that System 1 and 2 are 
linear. A linear system, whose equations are depicted in Fig. 3 in the upper left corner, has the 
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property that the order n depends only on orders that are smaller or equal to n. Thus, the equations 
for a linear system have a hierarchical structure, Fig 3, System 1. In |30j the equations for the first 
two orders for a linear system were solved. The second order variables like X riri , X pipi were called 
non-Poisson components in [3(|. This name came from the fact that the standard deviation can be 
expressed as = X T + X rr and for a Poisson process = X r . We conclude this section by noting 
that two stochastic systems (System 1 and System 2) are not coupled by simply taking output 
variables from System 1 and input them into System 2. The stochastic coupling requires that the 
output of the System 1 should first pass through an intermediate system and then enter into the 
System 2. 



B. Molecular Diagrams 



A graphical representation of gene regulatory networks is essential in order to capture information 
about gene interaction. A notational system should satisfy four important criteria 

(1) Expressiveness: the diagram should describe every possible relationship among molecules. 

(2) Semantically unambiguous: different symbols should be assigned to different semantics. 

(3) Extension capability: the notation system should be flexible, so that new symbols can be 
added in a consistent manner. 

(4) Mathematical translation: each diagram should be able to be converted into a mathematical 
formalism for use in quantitative computations. 

A B 



Component of the state q 



Ik 



e m£V- VV l »- | V in k tl1 



e m (q.)e m (q ; ) 
m k k ttte i 



a b 



he r ue- 



Transition 



Control lines of a transition 



algebraic sum of all terms 
I that come through the lines 
that end on a transition 



Action lines of a transition 





e 











The line |E|| ends 
on the molecule q . 



FIG. 4: Rules for molecular diagrams. 



Based on the mathematical model for the stochastic genetic networks, we can assemble a set of 
rules to construct diagrams that obey the above criteria. The building blocks of the model are: the 
state q, the transitions e and the transition probabilities T e . Each of these building blocks will be 
represented in a diagram, whose graphical notations are depicted in Fig. 4. The component is 
represented by an oval, Fig. 4A, first row. If the molecules from a specific biological context are 
classified in families (antibodies, cytokines, etc.) then, instead of using an oval for each molecule, 
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a specific geometric shape can be associated with each family. For example, to distinguish between 
an mRNA and a protein, we used a quadrilateral symbol for mRNA and an oval for the protein. 
Each transition will be represented by a square, Fig. 4B first row. If necessary, the transitions can 
be grouped in phosphorylation, transport, transcription, etc. For each class of transition a different 
geometric symbol can be used. The transition probabilities are built upon decreasing factorials 
()H.8|) . A decreasing factorial e mk (qk) will be represented by a line starting from the component q^, 
Fig. 4A second row. If the coefficient in front of a decreasing factorial is positive, the line will carry 
an arrow; otherwise the line will end in a bar. We will not write on top of its corresponding 
line if = 1. A product of two decreasing factorials is represented by joining the lines of each 
of the term in the product, Fig. 4A third row. Graphical representation of a rational function 
is depicted in Fig. 4A fourth row. The lines representing the terms from the denominator of the 
rational function are marked by a filled square. Two types of lines are associated with a transition 
e. One type ends on the border of the square representing the transition, Fig. 4B second row. These 
lines originate on different components qt on which the transition probability T e (q) depends. The 
components q^ control the transition probability T t (q) so the lines that end on the boundary of the 
transition symbol are called control lines. Each transition will act on some molecules to change their 
number. This action is described by the components of the transition e. Each nonzero component 
of an e will be associated with a line that starts from the center of the transition symbol, Fig. 4B 
third row. These are called action lines. Each action line ends on a component q^ that is changed 
by the corresponding transition. If the component of a transition e is positive, the line is marked 
by an arrow and by a bar if the component is negative. In Fig. 4B third row, < 0, ej < and 
gfc > 0. If |ej| = 1 we will not write it on its corresponding line. Finally, terms that correspond to 
lines that end on a transition e must be summed to form the transition probability, Fig. 4B second 
row. The molecular diagram for the system under study, Fig. 5, follows from Table 1 and the rules 
from Fig. 4. 

The generator line has m = written on it, which corresponds to e( 0j o,o,o) = 1> from Table 1. 
This line does not reach the center of the transition ex, indicating that it is a control line. The 
transition is controlled by the generator and its transition probability is readable from the diagram: 
T tl = G(i)e(o,o,o,o) = From the center of the transition ei starts a line that points to the mRNA 

symbol r\. This line, because it starts from the center, corresponds to the effect of the transition and 
tells that only one component of e\ is not zero, ei = (1, 0, 0, 0). The line has an arrow that indicates 
that this component is positive, so the transition will cause an increase of mRNA by one molecule. 
All other lines can be read in the same manner. From the protein p\ symbol there are two lines that 
control the transition e^. These lines represent the nonlinear coupling T ts = hp\ = hp\ + hp\(jp\ — 1) 
decomposed in the factorial bases. In Fig. 5, the line marked with the number 2 corresponds to the 
term hp\{p\ — 1) whereas the other line indicates the term hp\. We will use molecular diagrams to 
present the interactions for each example that follows. 

C. Units of Measurement 

The transition probabilities T e (q,t) are measured in [seconds] -1 and thus the coefficient M™(t) 
from T e (q,t) = ^ m M™(t)e m (q) is measured in [moles] - '" 1 ' [seconds] -1 . At present, the numerical 
values for the molecular constants M™(t) that govern the mechanisms inside the cell can not be 
determined precisely through laboratory experiments. In what follows a numerical coefficients will 
be written without a unit of measurement, to show that it is not experimentally determined. We 
use numerical values to compare the analytical theory with Monte Carlo simulations and to show 
how the general formulas can be use in practical applications. 
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FIG. 5: Molecular diagram for two coupled systems 



D. Hill Feedback Control 



One of the basic elements of a gene regulatory network is a gene that controls its own transcription 
38H . The protein acts on mRNA production through a term of the form 



P 



2 ' 



(11.17) 



When the number of protein molecules increases, the rate of mRNA production will decrease, 
stabilizing the system's transcription and translation. This kind of feedback control is employed 
in the description of many biological systems. In 0| it is used to exp lain the appearance of 
multist ability in the lactose utilization network of Escherichia coli. In [391 ] it is used to describe a 
stable oscillator constructed from three genes that repress themselves in a closed loop. 

Our study focuses on the case where a signal generator accompanies the feedback. When the 
generator is turned off, the gene is driven slowly by the nonlinear feedback. This special case will 
be addressed in the following subparagraph. From a mathematical point of view, this system is 
interesting because the transition probability for repression is a rational function. The Table 2 and 
Fig. 6 presents the structure of the system 

The lines that do not start on any molecule represent the coefficients a\ and b\. The coefficient 
bi, being a term in the denominator of the feedback transition probability, has a small square 
superimposed upon it. 

The Master Equation for the state probability is transformed by multiplying the whole equation 
ifTOjl with 6i + b 2 p + b 3 p(p- 1). 
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TABLE 2: Hill Feedback Control 



ei = (l,0) 


e_i = (-l,0) 


£2 = (0,1) 


£-2 = (0,-1) 


t - rtt) i ai + a2P 
ei [ > b 1 + b 2 p + b 3P (p-i) 


T e _, = lr r 


T t2 = Kr 


Te-2 = IpP 




- 1 

FIG. 6: Autoregulatory gene. The feedback has a Hill coefficient of 2. 



(6i + b 2 p + 6 3 p (p - 1)) — P( r , p, t) = (ai + a 2 p ) (P(r - 1, p, t) - P(r, p, t)) + 

(bi + b 2 p + b 3 p(p — l))[G (t) (P(r - 1, p, t) - P(r, p, t)) + 7r (r + 1) P(r + 1, p, t) - 7r r P(r, p, t) + 
KrP(r,p- l,t) -KrP(r,p,t) +J P (p+ l)P(r,p+l,t) - j p p P(r,p,t)) . 

It is now possible to take the z-transform (|II.5|) and then change the variable from F(z, t) to 
X(z,t), pi,15j) . The dynamical equation for X(z,t) contains both the effects of the nonlinear 
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feedback as well as the input signal generator G(t): 

hdtX + b 2Zp (d p d t x + dtXdpX) + b 3 z p 2 (dppd t x + d t xd pp x + 2 d p xd t d p x + {d p xf d t x 

G (t) (z r - 1) (&i + b 2 z p d p X + b 3 z p 2 (d pp X + (d p X) 2 )^j + (z r - 1) (ai + a 2 z p d p X) + 

7r (1 - Zr) (bld r X + 6 2 Z p (d rp X + d r Xd p X) + 

&3^ 2 (5 rpp x + d pp xd r x + 2 d p xdr P x + (d p x) 2 a r x)) + 

Kz r (\ (z p - 1) <9 r X + b 2 z p d r X + 6 2 z p (* P - 1) {drpX + d r XdpX) + 

2 6 3 z p 2 (a rp x + (a r x) + 
6 3 ^ 2 - 1) (OrppX + d pp xd r x + 2 dpXdrpX + (d p xf d r x)) + 

7p (&i (1 - z p ) d p X - b 2 z p d p X + b 2 z p (1 - z p ) (<9p P X + (d p X) 2 ) - 

2b 3 z p 2 (d pp X + (d p X) 2 ) + 

b z z 2 (i - zp) (d ppp x + 3 d pp xd p x + (d p xf)) . 



The time dependent variables are the factorial cumulants and can be obtained from the Taylor 
expansion of X(z, t) about z = 1. Thus, we obtain an infinite number of equations; the first three 
of these are: 

b 2 X p + 6 3 [x pp + 2 X P X P ^ = K (b 2 X r + 2b 3 {X rp + X r X p )^j (11.18) 
- lp (b 2 X p + 2b 3 {X pp + {X p ) 2 ) 



b\X r + b 2 {x p X r + X^ + 63 ^2 XpXrp + X pp X r + 2 X rp X p + (X p ) 2 X r + X rw ^ 
G (t) (bi + 6 2 X P + 6 3 (^ w + (X p f)^ + ai + a 2 X p 

~ 7r + ^2 (X r p + X r X p ) + ^(Xj-pp + X pp X r + 2 X p X rp + (X p ) 2 X r 

-\-K {b 2 (X r + X rr ) + 2 63 (X rp + X r Xp + X rrp + X rr Xp + X r X rp ) 

— 7p ( &2-^rp + 2 63 (X rJ3? , + 2 X p X rJ ,) 



61 X p + 6 2 X p + b 2 [X pp + XpXpJ + 2 63 (X PP + 2 X P X P J (II. 19) 

+&3 {{X p ) 2 X p + 2 X p X pp + X ppp + 3 XppXp^ = 

K (b\X r + fe 2 (X r + 2X rp + X r X p ) + 63(4X rp + \X r X p + 4X p X rp + 3X pp X r + 3X rpp + (X p ) 2 X r )^ 
-7p + M^p + 2X pp + (Xp) 2 ) + 6 3 (4Xpp + 4(Xp) 2 + 3Xppp + 7XppX p + (X p ) 3 )) . 

If a\ = and a 2 = 0, then the nonlinear feedback disappears and the system becomes linear. The 
equations then factorize into the simple equations discussed before for System 1, Fig. 3. When the 
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feedback is nonzero, that is a\ ^ and 7^ 0, the equations does not factorize. The left side of 
each equation in ()II.18|) is polynomial in the time derivative X m and X m . This is characteristic for 
the rational transition probabilities, as will be proven in section 4. To assure that the transition 
probabilities are positive, we will work with a positive signal generator G(t) > 0. We will split the 
generator G(t) into a constant component G and a time variable component g{t) 

G(t) = G + g{t). (11.20) 

For the constant component we take G = {G max — G m i n )/2 with G max and G m i n being the 
maximum and respectively the minimum value of G(t). With this choice for G we have \g(t)\ < G, 
so solutions to the equations ()II.18j) can be found by the method of expansion with respect to a 
small parameter. Namely, insert a parameter r/ in 

G(t) = G + rjg(t), (11.21) 

and generate approximations, X m ^, by collecting the like powers in 77 : 

X m (t) = X m>0 + r)X m>1 {t) + r) 2 X m>2 (t) + ... ; (11.22) 

then eliminate rj by setting it to 1. The constant term G fixes a stationary level X m ^ that 
obeys a system of equations obtained from (jll. 18|) by eliminating any time derivative and putting 
G instead of g{t). The solution to the stationary case is interesting from a practical point of view 
and will explored it in the next paragraph; afterwards, we will return to study X m ^{t). 

1. Designing the Shape 0/ a Logic Pulse 

In electrical engineering systems, properly connecting equipment along a signal path requires 
strict compliance with various standards. The logic l's and 0's must be designed in such a way that 
they will be detected correctly after passing through chains of devices. A TTL device is guaranteed 
to interpret any input above 2 volts as a logic 1 or true and any input below 0.8 volts as a logic 
or false; thus, there is a 1.2 volts protection against noise. Translating these ideas to a molecular 
device, we want to use the autoregulatory system to generate a logic pulse in protein numbers. Then 
let G\ be the input signal for a protein level that represents a logic and G2 for a logic 1. 

G 2 
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FIG. 7: Constant Levels for £. 



16 



Because the system is intrinsically stochastic the logic levels refer to the mean values of the 
protein, (pi) and (p 2 ) respectively. The protein values will fluctuate around these mean values. A 
measure of this fluctuation is the standard deviation of the protein numbers o\ = ^/{{(pi — (pi)) 2 )) 
and similar for a 2 . To separate the logical levels we will ask that the following ratio: 

i = (H.23) 
(pi) + 0"1 

be high enough. The constant contour plot of the ratio £ is presented in Fig. 7 for the following set 
of numerical parameters 

{ 7r = 2, 7 P = 1, K = 0.5, ai = 1, a 2 = 0, b x = 0.01, b 2 = 0.001, b 3 = 0.001} . (11.24) 



number of 
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molecules 
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FIG. 8: Pulse design. 



To design a shape for a pulse we first fix G\ for the logic and then choose a value for the ratio 
£. From Fig. 7 we read the input G 2 for the logic 1. For example, if G\ = 10 and £ = 2 we obtain 
G 2 = 95, which is the point A on the graph. The pulse for these values is shown in Fig. 8. This figure 
also shows that the analytical values are confirmed by a Monte Carlo simulation using the direct 
Gillespie algorithm, see Materials and Methods. The simulated means and standard deviations are 
based on 500 independent stochastic processes. The pulse is separated from the baseline level by a 
factor of £ = 2 and only a few of the simulations drop down in the region {pi) ±01. The comparison 
of the analytic formulas with the Monte Carlo results proves the power of the method outlined, 
especially that the variables X m are well suited for numerical computations. The next paragraph, 
and other examples that follow, will show the effectiveness of the factorial cumulants. 



2. The Autoregulatory Gene Driven Only by its Protein Level 



If G = 0, the generator is closed, leaving only the feedback to sustain the mRNA production. 
An equilibrium between mRNA and the protein level will take place. Using the traditional method 
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of mass action (chemical equilibrium), we can compute this equilibrium by equating the production 
rates with the degradation rates: 



h + b 2 p + b 3 p(p 



1) 

kr 



7 r r 



(11.25) 
(11.26) 



The mass action procedure assumes that the stochastic process is Poisson, so the size of the 
standard deviation from the mean equals the square root of the mean. We found that the mass 
action procedure does not explain the data obtained from Monte Carlo simulations, Fig. 9 and 
Table 3. However, we match the simulations by solving the first N equations of the infinite system 
of equations ()II.18|) . As iV increases, the solutions more closely approach the simulated data; see 
Fig. 9 where N = 7 and 15. 



TABLE 3: Analytical Solutions Explain Monte Carlo Simulations 





Monte Carlo 


15 equations 


7 equations 


mass action 


mean 


12.38 


12.37 


12.16 


11.54 


mean + standard deviation 


17.55 


17.50 


16.76 


14.94 



analytical solution 
mRNA 



, mean 

simulation 

|_mean + siandnrd deviation 
15 equations I 
7 equations 2 
mass action 3 




FIG. 9: Analytical solution explaining the simulated data. 



As we take more equations we obtain better results, so the variables X m (factorial cumulants) 
are suited for numerical and analytical approximations. The conclusion is that the mean values 
depend on cumulants of higher orders and thus equations that mix the means with cumulants of 
higher orders must be solved simultaneously. 
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3. Response of the Nonlinear Autoregulatory Gene to a Time Variable Input Signal Generator 



The generator is now time dependent (|II.21} 



G(t) = G + g(t) 

The solution in the first order, (jll.22|) . is X m (t) = X m> o + X m> i(t). At this point we can move 
from tensor notations to a matrix notation to present the solutions in the usual form for input- 
output relations from control theory. We construct thus column vectors Xm\ and Xm using the 
lexicographic order for the tensor index m. For example 

-^(1) = [X r ,i, X p ,i, X rr> i, X rPt i, X PPt i, . . . ] . (11.27) 

which is written in a transposed form to save space. 

Using pi. 211) and equating the terms which contain r\ from both sides of (jll. 18|) we obtain a linear 
time evolution equation for Xn\ 



EX {1) =AX {1) +Bg(t), (11.28) 

where E and A are infinite square matrices and B is an infinite column vector. The entries of 
these matrices depend on the parameters of the system as well as the stationary solution vector A( ) . 
From these infinite systems of equations we construct a finite system with a dimension depending 
on how many orders for the factorial cumulants X m we want to keep (that is what is the maximum 
value for \m\ we need). To obtain a nonsingular matrix E the first equation in HII.18|) must be 
omitted. This equation is a consequence of the transition probabilities being a rational function. If 
the transition probabilities were only polynomials in the state variables, then this first equation will 
become a trivial = 0. A 2 x 2 finite system will have the following matrices: 

bi + b 2 X Pi o + bsX pPi o + b 3 Xpfi 2 2 63 X rr fi 

bi + b 2 + b 2 X pfi + 4 63 X P:0 + 3 b 3 X pP:0 + b 3 X pfi 2 _ 



A 1A = - 7r (61 + b 2 X pfi + 63 (X ppfi + X p /)) +k(b 2 + 2 b 3 X pfi ) + 2 kb 3 X rp , 
A 1>2 = G(b 2 + 2 b 3 X pfi ) +a 2 - lr (b 2 X rfi + 63 (2 X rpfi + 2 X r , X pfl )) 

+2 kb 3 X rfi + 2 kb 3 X rrfi - ij p b 3 X rPt o 
A 2 ,i = k(b 1 + b 2 + b 2 Xp fi + U 3 Xp fi + 2b 3 Xpp fi + b 3 (X ppfl + Xp fi 2 )) 
A 2j2 = k (b 2 X r fi + 4 b 3 X r fi + 2 b 3 X rP: o + 63 (2 X Tp $ + 2 X T $X p fi)) — 

1p {h + b 2 + 2 b 2 X P:0 + 8 63 X P: o + 4 b 3 X ppfi + b 3 (3 X ppfi + 3 X pfi 2 ) ) 



b\ + b 2 X p fl + 63 (Xppfl + Xp : o 2 ) 

B = 



However, finite systems of larger dimensions are needed to obtain accurate solutions for the mean 
and standard deviations of the molecule numbers. Large systems of equations are easily generated 
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with symbolic software like Maple or Mathematica; however, these equations are too large to be 
displayed within the article. However, using numerical values (jll.24|) and G = 30 we can go further 
and display the final results. The stationary point solution up to the second order 

X rfi = 20.251, X pfi = 10.125, X rrfi = 0.909, X rpfi = -0.935, X ppfi = -0.468 (11.29) 

shows that the mean value for mRNA is about 20 molecules and the protein number is about 10. 
The factorial cumulants of higher order have absolute values smaller that the corresponding factorial 
moments. For example F rr p would be of order 20(20 — 1) = 380 whereas X rT) o is about 1. 

In the spirit of control theory, the solution to (|II.28f) can be written as an input-output relation 
using the Laplace transform of the X m> \ variables: 



0.50 (s + 5.93) (s + 2.50) (s 2 + 5.94 s + 11.9) 
Xp ' l(s) = ( s 2 + 3.0 s + 2.62) ( S 2 + 4.14 s + 6.87) (s 2 + 10.2 s + 29.0) 9{s) (IL30) 

0.0075 (s + 65.9) (s + 2.04) (s 2 + 10.2 s + 29.8) 
pp ' 1 ^ ~ (s 2 + 3.0 s + 2.62) (s 2 + 4.14 s + 6.87) (s 2 + 10.2 s + 29.0) 9 ^ S ' 

Thus the mean and the fluctuation of the protein number can be directly related with the input 
signal g(s) which is the Laplace transform of git)- 



E. Michaelis-Menten Amplifier 



Catalytic enzymatic processes, like phosphorylation, are fundamental for biological processes. 
The process requires a substrate S reacting with an enzyme E to form a complex C which in turn is 
converted into a product P and the enzyme E, Fig. 10. In a test tube, the reaction proceeds in one 
direction, that is fe_i = and k—2 — which is a special case of Fig. 10. However, it is possible that 
in a cell a more general scheme where k-\ ^ and /c_2 7^ can take place |4^|; thus we study the 
case in Fig. 10. The substrate S is usually supplied in large quantities compared with the enzyme E, 
and the goal of the process is to transform the substrate S into the product P. We choose an input 
oscillatory signal generator to act on the enzyme E. Then, we follow the signal through the complex 
C to the output product P. It is possible to drive large oscillations in the product P using small 
oscillations in the enzyme E. In this case the catalytic process behaves like a molecular amplifier. 
This situation is analogous with how a transistor amplifies the input signal on its base. A constant 
voltage source is necessary to supply the energy for the electrical amplification. Here the role of 
the source is played by the substrate S, the signal in the transistor's base by the enzyme E and the 
output signal from the transistor's collector by the product P. The state is q = (E, S, C, P) and the 
transition probabilities are polynomials in the state variables, Table 4. The molecular diagram, Fig. 
11, depicts all possible transitions in the system and which variables control these transitions. 



k l k 2 
E + S ~ — ^ C ~ — E + P 




FIG. 10: Catalytic reaction. 
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TABLE 4: Michaelis-Menten Process 



ci = (1,0,0,0) 


Ti = G(t) 


e 3 = (-1,-1, 1,0) 


T 3 = fciS • 5 


e_! = (-1,0,0,0) 


T-i = 7b -E 


e_ 3 = (1,1,-1,0) 


T_ 3 = 


e 2 = (0,1,0,0) 


T 2 = K S 


£ 4 = (1,0,-1,1) 


T 4 = fc 2 C 


e_ 2 = (0,-1,0,0) 


T-2 = IsS 


e_ 4 = (-1,0,1,-1) 


T_ 4 = fc_ 2J B • P 


e_ 5 = (0,0, 0,-1) 


T-s = j P P 








FIG. 11: Molecular diagram for Michaelis-Menten process. 

The transition probabilities will generate a stochastic process described by the time evolution of 
its factorial cumulants: 

d t X = k\{zc - z E z s ) (d ZEZS X + d ZE Xd zs X) + k-t(z E zs - z c )d zc X + k 2 (z E z P - z c )d zc X 
+G(t){z E - 1) + nE {l - z E )d ZE X + i ls (l - z s )d zs X + VP (1 - z P )d Zp X 
+h- 2 {zc ~ z E z P ) (d ZEZp X + d ZE Xd Zp X) + K s {z s - 1) • 
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The term G(t)(z E — 1) represents the time variable input generator which acts on the enzyme 
E, whereas K${zs — 1) comes from the constant production of the source S. An advantage of the 
variable X(z,t) is that the generator terms do not contain the variable X(z,t). This will translate 
into equations with constant coefficients for the X m variables: 

X E = -ki {X ES + X E X S ) + k.iXc + k 2 X c + G(t) - j E X E (11.31) 

-k- 2 (X EP + X E X P ) , 
X s = -h {X ES + X E X S ) + k-iXc - j s X s + K s , 

X C = h (X ES + X E X S ) - k-xXc - k 2 X c + k-2 (X EP + X E X P ) , 
X P = k 2 X c - ipXp - k- 2 {X E p + X E X P ) , 
X EE = -2 ki (X EE X S + X E X ES ) + 2 k^X EC + 2 k 2 X EC - 2 j E X EE 

—2 k_ 2 (X EE Xp + XeXep) , 
Xes = —k\ (Xes + XeX$) — k\ (X E sXs + XeXss) — k\ (XeeXs + XeX E s) 
+k^X c + K- X Xsc + k^X EC 

+k 2 Xsc — IeXes — IsXes — k- 2 (XesXp + X E Xsp) , 
Xec = —k\ (XecXs + XeXsc) + k\ (XeeXs + XeXes) 
+k- 1 X C c ~ k-iX E c + k 2 X C c ~ k 2 X E c 

—IeXec — k-2 (XecXp + XeXcp) + k- 2 (X EE Xp + XeXep) , 
X E p = -h (X E pX s + X E X S p) + k^iX C p + k 2 X c + k 2 X C p + k 2 X E c ~ 1eX E p 

—IpXep — k_ 2 (Xep + XeXp) — k-2 (XepXp + XeXpp) 

—k- 2 {XeeXp + XeXep) , 
Xss = -2 h (X ES X S + X E X S s) + 2 k-iX S c - 2 isXss, 
Xsc = —K\ (XecXs + XeXsc) + k\ (XesXs + XeXss) + k-iXcc 

—k^iXsc - k 2 Xsc — isXsc + k-2 (XesXp + XeXsp) , 
Xsp = —ki (X EP X S + X E X S p) + k-\X C p + k 2 Xsc - IsXsp - lpX$p 

—k- 2 (XesXp + XeXsp) , 
Xcc = 2 k\ (XecXs + XeXsc) — 2 k-\Xcc 

-2 k 2 X C c + 2 k^ (X EC X P + X E X CP ) , 
Xcp = k\ (XepXs + XeXsp) - k-\Xcp — k2Xcp + k 2 Xcc 

—IpXcp + k- 2 (XepXp + XeXpp) — k-2 (XecXp + XeXcp) , 
Xpp = 2k2Xcp — 2~/pXpp — 2k-2 (XepXp + XeXpp) . 

The generator G(t) = G + g(t) will determine a stationary state by G and a time variation by 
git). For an oscillatory input of the form 

G(t) = G + Gcos(u;t) (11.32) 

and for the following numerical coefficients 

{ 7P = I, K-2 = l,Ki = 0.1, K 2 = 6, Ks = 50, G = 300, 7s = 0.003, K- X = 0.1, lE = 50} , 
the stationary state is 

{X E , = 6.00, X s ,o = 92.71, X c>0 = 58.14, X Pfi = 49.72, X EEi0 = -0M,X ES , = -.88,X EC , = 
0M,X EPfi = 0.76, Xss,o = 12.28, X sc ,o = -7.82,X 5 p, = -l.23,X CS ,o = 3.10,X CPi0 = 
3.60,X PPi o = -2.33}. 
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The stationary state was solved up to the second order cumulants using the equations 1)11.31(1 . 
In the first order, the time variation of the X m variables is X m (t) = X m> o + X m i(t), (|II.22|) . Each 
variable X m i(t) will contain an oscillatory component A m (uj)e tujt . As a function of u, and for the 
parameters considered above, the ratio between the amplitude of the protein oscillations and the 
amplitude of the enzyme oscillations is presented in Fig. 12. The maximum of this ratio is 3.71 at 
u = 3. 
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FIG. 12: Protein amplification, 



A Pade approximation about uj = 3 for the amplitudes' ratio is 



\A p {uj)\ 106.5w + 1.0 
\A e (uj)\ ~ w 2 + 22.7a; + 9.2 

The frequency to is measured in [time] -1 , see section C. Thus, the enzyme oscillations are ampli- 
fied as they pass to the product P, Fig. 12. So far, we analyzed stationary solutions and stationary 
periodic regimes. Another question to address is how the system behaves in a transitory regime. 
We did 500 Monte Carlo simulations for the system that starts at the time t = from the zero 
initial conditions (all molecule numbers are zero). As the energy is pumped into the system by the 
signal generator G(t) = G + Gcos(wt), the molecule numbers will grow towards a stable periodic 
state. The transitory process is an oscillatory variation superimposed on a growing exponential 
trend, Fig. 13. The numerical solution of the equations (|II.31|) . show that the simulated data are 
explained by the dynamical equations. The error measured by the L2 norm of the difference between 
the simulated data and the computed data is 1% of the norm of the simulated data. Thus we can 
work with numerical solutions to (|II.31|) instead of using Monte Carlo simulations. Moreover, on 
1)11.31(1 we can apply a different analytical approximation (i.e. harmonic balance, expansion in a 
small parameter) which can capture the behavior of the system as a function of its parameters. We 
observe, Fig. 13, that the product oscillates in antiphase with respect to the enzyme, a phenomenon 
also present also in a basic transistor amplifier. 



(11.33) 
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Enzyme 




time 



FIG. 13: Transitory regime: Numerical solutions of the equations l|II.31|) agree with Monte Carlo simulations. 

F. E2F1 Regulatory Element 

The systems studied in the preceding paragraphs were excited by one signal generator. There 
are many cases in living organisms where a gene is regulated by more than one signal. In what 
follows we will study a regulatory element inspired by the E2F1, a member of the E2F family of 
transcription factors ^J. In addition to its established proliferative effect, E2F1 has also been 
implicated in the induction of apoptosis through p53-dependent and p53-independent pathways 
j42j ]. The components of the E2F1 system were described in j4^|. The mRNA level, Fig. 14, is 
regulated by 3 transcription factors E2F1, pRb and DPI. The regulation is done by the dimer 
E2FLDP1 which, for short, is denoted by the letter a. This dimer binds to the DNA and its effect 
is to increase the rate of transcription. The second control of transcription is from the complex 
a:pRB, that binds to the DNA and repress the transcription. The state of the system is thus an 
8 component vector q = (E2F1, DPl,pRB,a,b,c,d,mRNA). This example shows that a state is 
not just a list of different species of molecules. The same dimer E2F1:DP1 is present in the state 
vector as a component a unbound to DNA and a state component c when the dimer is bound to the 
DNA. These are 2 different situations of the same molecular species and we treat them as different 
components of the state of the system. In the diagram of Fig. 14, we use an oval glued to a rectangle 
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TABLE 5: E2F1 Regulatory Element 



ei = (1,0,0,0,0,0,0,0) 


T ei = Gi(t) 


e_i = (-1,0,0,0,0,0,0,0) 


T £ _ 1 = fc_i £2F1 


e 2 = (0,1,0,0,0,0,0,0) 


T C2 =Ga(t) 


e_ 2 = (0,-1,0,0,0,0,0,0) 


T e _ 2 =k- 2 DPl 


e 3 = (0,0,1,0,0,0,0,0) 


r £3 = G 3 (t) 


e_ 3 = (0,0,-1,0,0,0,0,0) 


T f __ a =k_ zP Rb 


e 4 = (-1,-1,0,1,0,0,0,0) 


T £4 = k 4 E2Fl ■ DPI 


e_4 = ((1,1, 0,-1, 0,0, 0,0) 


T £ _ 4 = fc_4d 


e 5 = (0,0,0,-1,0,1,0,0) 


T £3 = k 5 a (m - n 2 c - n 3 d) 


e-5 = (0,0,0,1,0,-1,0,0) 


r £ _ 5 = fc_ s c 


e 6 = (0,0, -1,-1,1,0,0,0) 


T es = k 6 a ■ pRb 


e_ 6 = (0,0,1,1,-1,0,0,0) 


r £ _ 6 = fc_ 6 & 


e 7 = (0,0,0,0,-1,0,1,0) 


T £ _ = krb (mi — rn 2 c — m%d) 


e_ 7 = (0,0,0,0,1,0,-1,0) 




e s = (0,0,0,0,0,0,0,1) 


T £fi = k a c(h-l 2 d) 


e_ 8 = (0,0,0,0,0,0,0,-1) 


r £ _ s = fc_ g r 



to graphically depict the DNA-bound form of a transcription factor. The transition probability that 
controls the mRNA production is T eg = k%(l\ — fad), Table 5. The transition es is upregulated by 
c so T t8 is proportional to c. The repression by d is described by the term fa — fad chosen to obey 
two criteria: (1) it decreases as d increases, and (2) is positive so the transition probability will be a 
positive number. A specified set of parameters together with a solution to the equation of motions 
are realistic if the transition probabilities stay positive or zero for any instant of time. One control 
line starting from c ends with an arrow on the transition es which signifies that c upregulates the 
transcription. This control line corresponds to the term kgfac from T eg . Another control line starts 
from c and d and ends in a short bar, thus repressing the mRNA transcription. This line represents 
the term —kgfacd from T t8 . Variations in the number of c and d molecules will depend on variations 
in E2F1, DPI and pRB. Thus we will insert 3 signal generators G\(t), and G$(t) to modulate 

the levels of E2F1, DPI and pRB respectively. The transitions ej, with i = 1,2,3, represent these 
input generators. The degradation transitions e_j, i = 1,2,3, are taken to be proportional with 
the number of molecules being degraded. The formation of the dimer a is described by €4 which 
is controlled by a nonlinear transition probability (the product of E2F1 with DPI). The transition 
€5 represents the binding of the dimer a to DNA. That is the creation of one c from one a, as is 
readable from the components of the transition €5 . The transition probability for this binding event 
is proportional to the number of a dimers and with the free space available on the DNA. This free 
space available for DNA binding should be of the form n — c — d where n is the maximum number 
of proteins that can bind to DNA to regulate the transcription. We subtract from n the space 
already occupied which is c + d. In order to cover the situation when the binding properties of c 
and d are different we use T e5 = k^a{ni — n%c — n%d), with n\, n% and 713 some constant coefficients. 
Then, the transition e% is like £4 and £7 like €5. The transitions e_«, i = 4,5,6,7, represent reverse 
processes. The equations are more compactly written if we index the state by integer numbers: 
qi = E2FI, f/2 = DPI, qs = pRb, q^ = a, q§ = b,qe = c, q? = d,q$ = mRNA. The time evolution for 
X(z, t) is given by: 




FIG. 14: Molecular diagram for E2F1 regulatory clement. 
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d t X = d (t) ( Zl - 1) + k-x (1 - z x ) dtX (11.34) 

+ G 8 (t)(z 2 -l) + k~ 2 (l-z 2 )d 2 X 

+ G 3 (t)(z 3 -l) + k_ 3 (l-z 3 )d 3 X 

+ k 4 (z 4 - z\z 2 ) {di 2 X + d\Xd 2 X) + k_ 4 {z\z 2 - z 4 ) d 4 X 

+ k 5 (z e - z 4 ) (nidiX - n 2 z e (d 46 X + d 4 Xd 6 X) - n 3 z 7 (d 47 X + d 4 Xd 7 X)) 

+ fe_ 5 (z4 - z 6 ) d 6 X 

+ k e (2:5 - z 3 z 4 ) (d 34 X + d 3 Xd 4 X) + /c_ 6 {z 3 z 4 - z 5 ) <9 5 X 

+ fe 7 (z 7 - z 5 ) (midr X - m 2 z e {d 56 X + d 5 Xd 6 X) - m 3 z 7 (d 57 X + d 5 Xd 7 X)) 

+ k^ 7 (z 5 - z 7 ) d 7 X 

+ k 8 {zezg - ze) (hd 6 X - l 2 z 7 {d 67 X + d 6 Xd 7 X)) 

+ k^ 8 (1 - z 8 ) d 8 X . 

The equations for the mean of the state components are: 

Xi = Gt (t) - k_iXi - k 4 (X 12 + X X X 2 ) + k- 4 X 4 , (11.35) 
X 2 = G 2 (t) - k. 2 X 2 - k 4 (X 12 + X±X 2 ) + k. 4 X 4 , 
X 3 = G 3 (t) - k_ 3 X 3 - k 6 (X 34 + X 3 X 4 ) + k. 6 X 5 , 

X 4 = k 4 (X l2 + X X X 2 )- k_ 4 X 4 - k 5 ( ni X 4 - n 2 (X 46 + X 4 X 6 ) - n 3 (X 47 + X 4 X 7 )) 

+ k. 5 X 6 - k 6 (X 34 + X 3 X 4 ) + k_ 6 X 5 , 
X 5 = k 6 (X 34 + X 3 X 4 ) - k^X b - k 7 (miX 5 - m 2 (X 56 + X 5 X 6 ) - M 3 (X 57 + X 5 X 7 )) + k_ 7 X 7 , 
X e = k 5 ( ni X 4 - n 2 {X m + X 4 X 6 ) - n 3 (X 47 + X 4 X 7 )) - k- 5 X 6 , 
X 7 = k 7 (miX 5 - m 2 (X 56 + X 5 X 6 ) - m 3 {X 57 + X b X 7 )) - k_ 7 X 7 , 
X 8 = k 8 (hX 6 - l 2 (X 67 + X 6 X 7 )) - k. 8 X 8 . 

As in the previous examples, the system of equations is infinite because in the equation for an 
X m , variables X m i with higher order indices m' > m are present. For example, to compute the 
standard deviation of the the mRNA number, we need to include the following equation in the 
system 

X 88 = 2 k 8 {l\X§ 8 — l 2 (Xq 78 + XqX 78 + X§ 8 X 7 )) — 2 k- 8 X 88 , (11.36) 

which contains the third order variable ^678- Because the state has 8 components, there will be 
a total of 164 equations for the X m (t) variables, up to third order in m (|m| < 3). We will disregard 
any fourth order variables in the following, so that we will work with a set of 164 equations. Again, 
the constant part of the generators G\,G 2 and G 3 will fix a stationary state. The equations for 
the stationary point are polynomial in X m ^, and thus can be solved by using one of the existing 
algorithms for polynomial systems of equations The system is large and many unphysical 

solutions will be generated by solving it directly. A strategy to obtain the desired solution is to 
use at the beginning only the equations for the first order cumulants where we set all higher order 
factorial cumulants equal to zero. This partial solution will be used as a starting point for finding 
a solution for the entire 164 equations. For the following set of parameters: {G\ = 340, G 2 = 
275, G 3 = 2.86, k 4 = l,/c_ 3 = 0.13, fc_ 6 = 3.025, k_ 4 = 10, k 5 = 14,h = 60,k_ 7 = 7.773, fc_i = 
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17, k_ 2 = 11, k 8 = l,k 6 = 0.11, fc 7 = 0.011, k_ 8 = 22.800, /c_ 5 = 9275, n x = 60, n 2 = 1, n 3 = 1, mi = 
60, ?«2 = 1,7713 = 1, l\ = 60, ?2 = 1}, the stationary point for the mean values of the state is 
{A 1>0 = 20,A 2>0 = 25,X3, = 22,X 4 ,o = 50,A 5i0 = 40,X 6>0 = 4,A 7>0 = 3,X 8fi = 10}. We notice 
that the constraints imposed upon the transition probabilities were effective in the sense that the 
values for the molecules c and d that bind to the DNA are much less than the values for the unbound 
molecules a and b. Out of the 164 values for the X m fl at the stationary point, the maximum value 
is X^gfi = 6.88 which expresses the correlation between the mRNA and the dimer a. The minimum 
negative number is X^,o = —0.73, for the correlation between mRNA and the DNA-bound molecule 
d. The generators will modulate the mRNA level through their time dependant term gi(t),i = 1,2, 3. 
The effect of these input variations can be computed like we did in the previous examples, using an 
expansion in the small parameter rj. Then the mRNA level will vary according to 



X 8>1 (s) = Hl( S ) gi (s) + Hi(s)g 2 (s) + Hl(s)g 3 (s) , (11.37) 
A 88jl (s) = Hl 8 (s) 9l (s) + Hl 8 (s) g2 (s) + Hl s (s)g 3 (s) , 

were the transfer functions are 

s 0.06 s 2 + 0.65 s + 0.87 TT , , . 0.0002 s 2 + 0.0228 s + 0.0340 

H 8( S ) = 71 I i 7Z i i ao\ I, \ i oe\ » ^88l s ) 



(s + 1.57) (s + 1.48) (s + 1.38) ' 8SV ; (s + 80.9) (s + 1.57) (s + 1.48) ' 

0.01 s 2 + 0.18 s + 0.26 „, , s 0.01 s 2 + 0.14 s + 0.29 



Hs ^ (s + 1.57) (s + 1.48) (s + 3.79) ' (s + 1.57) (s + 11.30) (s + 3.01) 

o , . -0.16 s 2 -2.16 s -0.52 o /N 0.006 s 2 - 0.066 s - 0.023 



(s + 0.043) (s + 1.48) (s + 22.8) ' 88V ' (s + 1.57) (s + 13.08) (s + 0.043) ' 

The transfer functions obtained from all 164 equations are actually rational functions of much 
higher powers than those above. The inverse Laplace transform of these solutions consists of a sum of 
many exponential decaying components. Some components are very small, so they can be neglected. 
We did an exhaustive search through all groups of 3 components to find the best approximation. 
The error was computed as the ratio of the L 2 norm of the difference between the solution and 
its approximation, over the L 2 norm of the solution. The average error for the formulas above is 
6%. Using more than 3 decaying components to approximate the transfer function will decrease the 
error, but will increase the complexity of the rational function. Prom the expression of the transfer 
function H 8 , we see that the pRb acts with a negative sign, as was expected. However, the relative 
strength of the action of each signal generator on the mRNA level can not be assessed unless we 
solve the dynamical equations. 



III. TIME EVOLUTION EQUATION FOR A GENE REGULATORY NETWORK 

EXCITED BY SIGNAL GENERATORS 

The systems studied in the previous sections proved the usefulness of the equations for the 
variables X m (t) and showed how to use these equations to solve practical problems. Now we aim to 
write the time evolution equation for X m (t) for a general stochastic nonlinear regulatory network. 
Nonlinear refers here to transition probabilities that are rational functions in the state variables. A 
genetic regulatory network is represented by a vector q = (q±,..., q n ). Each qi represent the number 
of molecules for the component i of the state q. The components i = 1 . . . n can represent different 
proteins, mRNAs, or the same protein but in different configurations or localizations (in nucleus, 
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on the membrane, in Golgi apparatus etc.). The state components qi change in time due to a set 
of possible transitions e. If at time t the state is q, then at time t + dt the state will be q + e, with 
e one of the possible transitions. Each transition is governed by its probability of transition T e (q,t) 
which depends on the state variables q and time t. To express this state dependance we need some 
notations. 

A vector m = (mi,m2, ...,m n ) will index a set of polynomials which will constitute a basis for 
the set of all polynomials in q: 

em(q) = e mi (qi)-e mn (q n ) ( IIL1 ) 

with 

e-m k {qk) = Qk{qk - !)•■•(% ~m k + 1). (III.2) 

The e mk is known as the falling factorial and is related with the Pochhammer symbol. We will refer 
to ejn{q) as the factorial basis. Each vector m can be written in a tensor notation 

m = 11...122...2... nn^n (III.3) 

mi m 2 m n 

and each tensor index m can be transferred into a vector notation fn. The modulus of m or m is 
the degree of the polynomial: \m\ = \m\ = X^fc=i m k- 

To find the standard deviations or moments of the state q, the transformation between the 
factorial basis and the basis {1 . . . } is helpful 

l 

x l = Y,S(l,k)e k (x) , (III.4) 

k=0 

where S(l, k) are the Stirling numbers of the second kind. The inverse transformation depends 
on the Stirling numbers of the first kind s(l, k) 

l 

ei { x ) = ^s{l,k)x k . (III.5) 

k=0 

In general, 

n n rrii m n 

i=l i=l ki=0 fc=0 * = 1 

therefore 

m 

< e-M) >= s(mx,ki)s(m 2 , k 2 )...s(m n , k n ) < qfq^...q k n n > (111.7) 

k=0 
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and 



< q\ ni qT 2 -Qn n >=^S(m 1 M)S(m 2l h 2 )...S(m n ,k n ) < ej(g) > . (IIU 

With the help of the factorial basis we can write 3 types of transition probabilities, 
linear transition: 



polynomial transition: 



and rational transition: 



T e (q,t) = ^2M^t)q k , (III. 



T e (q,t)=Y / M?(t)e w (q) , (111.10) 



To keep the components of the state q positive or zero the transition probabilities should obey 
some boundary conditions. The state with at least one component set on zero is on the boundary 
of the set of all possible states. On the boundary, some of the e's will point outside, and there is 
a danger that the system will jump into a state with at least one state component with negative 
molecule numbers. To avoid the unphysical negative states, we will impose boundary conditions on 
the transition probabilities. Consider a system that is in a positive state at t = 0. This system will 
never jump in a negative region if T e (q,t) = for q^ such that qt + €i < —1, i = 1, ... , n. The last 
inequality can be expressed in terms of m if we look at the structure of the transition probability 
T £ (q, t) = Ylm M™{t)ern{c[) an d roots of ejn(q)- Namely, the condition q^ + ej < — 1, i = 1, . . . , n 
is fulfilled if 



m>-e (111.12) 

for every m in T e (q,t) with M™(t) ^ 0. The boundary condition (jIII,12|) for rational transition 
probabilities refers to the numerator of 

We will first deduce the time variation equation for the polynomial transitions: 



T e (q,t)=Y,M?(t)e m (q). (111.13) 
m 

To excite a gene regulatory network, an experimental scientist must act on it through a set of 
signal generators. The signal generators are present in the coefficients M e m (t). The most obvious 
way to introduce a signal generator is by adding it to a transition probability 

T t (q, t)=J2 M™e m (q) + G{t) . (III. 14) 

m 

Written in the factorial basis the generator is G(t) = M®(t)eo(q). This type of signal generator 
can be implemented using a light switch to control the promoter of a gene, Fig. 2. A different 
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type of signal generator modulates protein degradation. An experimental implementation of such a 
generator will have a tremendous impact on the protein function prediction. The influence of such 
a generator can be written as 

T e (q, t) = J2 M^emiq) + G{t)en{q) (III. 15) 

m 

where G(t) = M™(t), with n > — e to satisfy the boundary conditions. 

The equation for the probability of the gene regulatory network P(qi, ...q n ,t) to be in the state 
q at time t is 



dP(q,t) 

at 



T e {q - e, t)P(q - e, t) - £ T e (q, t)P(q, t) . (111.16) 



We are interested in the mean values for different molecules, as well as their correlations. The 
generating function for such variables is F(z,t) which is the iJ-transform of P(q,t): 

F(z, t) = Z(P(q, t)) = z q P(q, t) . (IIL17) 

1 

Here z = (z\ , . . . , z n ) and z q = z^ . . . Zn n ■ 

Taking care of the boundary condition (jlll. 12f) . the equation for the variable F(z,t) is 

B F 

— = J2(* e+m - z m )M™(t)d m F (111.18) 

where we used the property that 

Z(e m (q)P(q, t))) = z m d m F(z, t) . (111.19) 

The variables that describe the dynamic of the system are generated by taking partial derivatives 
of F(z, t) with respect to z. For this process the following relation is useful: 

d a (z e ) = Q a (e)z e -« , (111.20) 

with 

Q a (e) = e a . (111.21) 

In what follows, the Greek letters will refer to a one dimensional index that runs from 1 to n. The 
Latin letters will refer to a tensor index m = aia^a^"- • Then d a p(zl 1 ...z% 1 ) = dp(Q a (e)z e ~ a ) = 
Q a/3 (e)z e -«-P, with Q aP {e) = Q a (e)Qp(e -a). 
In general 

dmZ' = QMZ^ , 

with 

Qmo(e) = Q m (e)Qtt(e - rn). 
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From (|111.18|) we obtain 

d a F = ^(Q a (e + m)z e+ " - Q a {m)z")M^(t)d m F + ^(z £+ ™ - z w )M™(t)d ma F 
and for z = 1 

F a = ^(Q Q (e + m) - Q a (m))M™(t)F m . (111.22) 

With the compact notation 

Kit) = ]T(Q a (e + m) - Q Q (m))M £ m (t) , (111.23) 

e 

the equation (|III.22|) becomes 

F a = R™(t)F m , (111.24) 

where summation over the dummy index m is implied. 
Similarly: 

Fa/3 = Kp{t)F m + K{t)F mP + R^(t)F ma , (IH.25) 

with 

Kpit) = ^(Q a p(e + m) - Q a p{m))M™{t) . (III.26) 

e 

The general equations for the F rn variables are obtained by applying the operator d ai ,, Mn to 
(|III.18|I . We write the action of this operator on a product of two functions as: 

9 ai ...a n (fg) = {d ai ...a k fda k+1 ... an g} a , (111.27) 

where the braces indicate the summation for all pairs of disjoint sets {at\ . . . ak),(ak+i ■ ■ ■ a n) 
that form a partition of the tensor index ot\ . . . a n . When listing all possible partitions, we take care 
that a permutation of the elements of a set does not change said set. Also R m with an empty index 
set is zero, because Q(e) = 1, which comes from z e = Q(e)z e , see (1111.201) . 

Then the equation for the time variation of F m {t) is 

F ai ...a n = {R™ 1 ...a k {t)F mak+1 ... an } a , (111.28) 

where summation over the dummy index m is implied. The tensor R a ^ 1 ___ ak (t) is given by pil.26|) 
with a\ . . . a.k instead of the index a(3. To see the structure of the equation (jIII.28|) we specialize it 
for for n = 3. 

-^010203 = ^aia2"3 (f)F m (III. 29) 

+^qiq 2 {t)F m a- A +^tt 1 a 3 W^ma2 + -^a 2 a3 (t)F mai 
~^~^ , ai(.^ S )Fma2a3 ~i~ -^a 2 (fyFma\az ~\~ (^)-Pmaia2 ■ 
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A. Factorial Cumulants and Filled Young Tableaux 

The equation (|III.28|) has a similar structure with what is called a bilinear system in Nonlinear 
Control Theory. A bilinear system is represented by a time evolution equation that is linear in state, 
linear in control, but not jointly linear in both 



^ = Ax + N k u k x + Bu , (III. 30) 

k=l 

where x € M n and A, N k , k = 1 . . . n, B are appropriate matrices [43 |. The controls are u k ,k = 
1 . . . n and the state is described by x. However, the system (|III.28|I is not finite like ()III.30|) . and as 
we explained in the first section, discarding F m with higher order m will produce an unstable finite 
system of equations. Our goal is twofold: (1) to change the variables F m so that the discarding 
process for obtaining a finite number of equations become meaningful and (2) to keep the bilinear 
structure of the equations in the new variable. 

The change of variable: 



F(z,t) = e x{z ^ (111.31) 

is similar with the change from moments to cumulants jjjfj]. Because F(z, t) generates the factorial 
moments, X(z,t) will generate the factorial cumulants. 

The time dependent variables, F m , will be replaced by X m = d m X(z, t) \ z= \. The transformation 
relations between F m and X m follow from the Faa Di Bruno's formula for the derivative of the 
composition of functions To keep the bilinear structure, we need to construct an appropriate 
index notation for the terms of the Faa Di Bruno's formula. We introduce the index construction 
by way of an example. The fourth order derivative of F at z = 1 is 

FaPiS = X a/3j $ + (III. 32) 

X a f3-yX$ + X OL ^X 1 + X ai $Xp + X^sXa + 

XafiX^s + X al Xp$ + X a sX/fy + 

XapXyXs + X al XpX$ + XafiXpX^ + XpryXaXg + Xp&X a X 1 + X 1 $X a X/3 + 

X a XpXyXs 

Given the index af3^5 from the left side of (jIII.32|) we need to generate all the indices that 
appear in the right side of (|III.32|) . If the term X a ^XpX^ is present in the sum, then any symmetric 
version of it, like XgXj a Xs, cannot be present. If we classify all possible symmetries of a term, then 
we will find an index notation that will eliminate all the equivalent terms. The symmetries come 
from the commutativity of the product and the commutativity of the partial derivatives. Young 
tableaux and filled Young tableaux will help to classify the symmetries and also to construct an 
index notation which will keep the bilinear structure. A Young tableau is associated with a partition 
of an integer N. For a fixed positive integer N, to each partition N = fi\\i + /U2A2 + ... + Hk^k, 
Ai > A2 > ... > Afc > 0, we associate an empty Young tableau consisting of //, rows of Aj empty 
boxes, % = 1, .., k. For example consider the partition 8 = 3 + 2 + 2 + 1 = 3 + 2- 2 + 1, that is 
Hi = 1, Ai = 3, fi2 = 2, A2 = 2, H3 = 1, A3 = 1. The Young tableau corresponding to this partition is 
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We will use these Young tableaux filled with indices a, (3, 7, to represent products of X n 
variables: 

z X a ^Xs 



a 


P\l\ 


5_ 





Then (|III.32j) is written using filled Young tableaux as indices as follows: 



■I ■II 'I 
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r-i-i + X 
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n-i + X 
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+ x 
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The rows of a Young tableau are listed in decreasing order of their length, which will enforce 
an order in the product of the variables X m . Thus the symmetry due to the commutativity of the 
product is lifted. However, in a block of rows of equal length there is still an ambiguity in ordering 
the terms in a product. To lift the ambiguity, we will order rows of equal length in decreasing 
lexicographic order of the words that are placed in rows. For example, in the 6th term of the above 
formula a(3 > ^5 and thus afi is placed above 7<5. The lexicographic order between the tensor indices 
is induced by the order of the components of the state: qi -< qj if i < j. There is one more symmetry 
left to be lifted. This symmetry is generated by the commutativity of the partial derivatives and is 
lifted by ordering the letters in a row from left to right. For example, the 4th term in the formula 
above has a^5 on its first row and not "fa5. Mathematically, the symmetries are described with the 
help of a set of permutations that act on the tensor index that fill a Young tableau. The components 
of a tensor index m will be denoted using superscripts not to be confused with the components of 
the vector fn; thus m = m 1 m 2 m s .... A Young tableau Y filled with a tensor index m from F m is 
denoted by Y[m]. The filling process starts from the upper left box of Y where m 1 is inserted, and 
moves from left to right and top to bottom. The action of a permutation a on the elements of Y[m] 
is denoted by Yfm "] and is exemplified below: 



Y[m] = 


m 1 


2 

m 


m 


Y[m a } = 


m tr(l) 


m <r(2) 


m cr(3) 




4 

m 








m °-(4) 





For example, if Y= [^J I ^ , m = afi^5 , <r(l) = 1, <r(2) = 2, cr(3) = 4 and <r(4) = 3 we have: 
Y[m] = M £bJ Y[m' 7 ] 



Q 




7. 
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and 

^Y[rn] = Xafi^Xs (III. 33) 

-^Y[m CT ] = XapsX-y . (III. 34) 

To lift the last symmetry, we need to find the permutations a which leave the term Xy[ m ] 
invariant, that is Xy^m"] = Xy[ m y In general, the components m l of the tensor index m need not 
be distinct. For example we have m=rpp in (jII.18j) . However, to obtain the Faa Di Bruno formula, 
when we solve for a in X Y [ m <r] = X Y \ m ], the index m must have distinct components (m 1 7^ m 3 for 
all i 7^ j). The set of permutations a thus found form a subgroup of the permutation group Sry\- 
Here \Y\ is the dimension of the Young tableau Y which equals the total number of its boxes. This 
subgroup is denoted as H Y . 

The terms in the Faa Di Bruno formula will be generated using filled Young tableaux and a set 
of representative permutations a% , i = 1 . . . J chosen form each set of the coset space 

S m /H Y = {aiH Y , ajH Y } . (111.35) 

Here J is |Y|!/cardinal(i7 y ). The lexicographic order is a practical method to select a set of 
representative permutations a\, . . . , aj, without computing the invariant subgroup H Y and the coset 
space; we will use the lexicographic order for simple cases. However, for general results we will use 
the coset space. Some examples of invariant subgroups and coset spaces are presented in Table 4. 



TABLE 6: Subgroups and Coset Spaces 



Y[aj3j6] 


H Y 




1 a 





i\ s\ 




{(1)} 




a 







{(123), (12)} 


{(1), (34), (24), (14)} 




±_ 

a 
1 



S 




{(12),(34),(13)(24)} 


{(1), (23), (243)} 




a 
1 

<s_ 







{(12),(34),(12)(34)} 


{(1),(23),(24),(13),(14),(13)(24)} 




a 

J 

1 








{(1)} 




6 











In the above table we used the cycle notation for permutations: (123) means c(l) = 2, cr{2) = 
3, <r(3) = 1. 
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Finally we can write the Faa Di Bruno formula in Young tableaux notation: 

\Y\=\m\ a£S lY \/H Y 



(111.36) 



Here the tensor index m can have any form; there is no need for the components m 1 to be distinct 
(as it was when we defined H Y ). 

For partial derivatives with z not fixed to 1 the formula is similar 



d m F(z,t)= E d Y[ma] X{z,t)e x ^ , 

\Y\ = \m\ a€S m /HY 



(111.37) 



with the convention that the derivation, with respect to a filled Young tableau, is the product of 
the derivatives along each line of the tableau. Thus, using the example (|III.33|) we have: 



d Y [ m ]X(z, t) = d a d /3 d 1 X{z, t)d 5 X(z, t) 



(ni.38) 



B. Equation of Motion for Polynomial Transition Probabilities 

The equation (|III.18|) in the variable X(z,t) is 



8 t X{z, t) = Y,(z t+m ~ z m )M™{t) 



E Yl d Y[m°\X{z,t) 

\Y\=\m\ <reS ]Y \/H Y 



(111.39) 



The time dependent variables will now be d m X(z, t) \ z =i, so that we must take partial derivatives 
with respect to z of (jIII.39|) . The concatenation notation d a p = d a dp must be generalized for filled 
Young tableaux 



d a \Y{m]X(z,t) := d a (dY[ m ]X(z,t)) . 



(111.40) 



From the definition ()III.40|) . the concatenation a|Y[m] means that a box containing a must be 
glued to each row of Y[m] and the object thus obtained must be rearranged into a lexicographical 
order filled Young tableau. Here is an example of concatenation with a box filled with the index 2: 

M drrnrnn *)) = 9 rrr™^(^ *) + 5 rrn^ X & *) + 9 rr™ X(z, t) + X(z, t) 



1 


2 


3| 


1 


5 




6 


7 




8 







1 


2 


2 1 3 | 


1 


5 




6 


7 




8 







1 


2 


3 


2 


4 


5 


6 


7 




H 







Inductively we define 



1 


2 


3 


2 


6 


7 


1 


5 




8 







1 


2 


3| 


4 


5 




6 


7 




2 


8 





0, 



a\P\...y\Y[m] 



I7I Y[m 



])• 



(111.41) 



The concatenation notation will be also applied to the X m variable: 



Xa\Y[m] '■— d a \Y[ m }X(z,t) 



2=1 
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The equations for the factorial cumulants are now a consequence of (jlll.39|) 



(111.42) 



(111.43) 



Y,<7 



Y.a 



with being a short notation for the the sums over Y and a in (|III.39|) . 

In general 



X, 



aia 2 ...a r , 



R 



(*)X)^" a fe+il-l"»l Y K 



(111.44) 



On the right side of (jlll.44|) there are more types of Young tableaux than the one row tableau 
of the indices from the left side. This is a consequence of the nonlinearity of the system. To 
obtain a closed system of equations, though infinite, we must obtain the time evolution equation 
for XY[ m ] for any type of filled Young tableau Y[m], not only for one row, as in ()III.44|) . The 
equations written in the variables Xyym] will be bilinear, as desired; this procedure is known as 
Carleman bilinearization j4?J. To obtain these equations, we need to introduce the sum of two filled 
Young tableaux. Let Yi[m] and Y^It^] be two filled Young tableaux with corresponding partitions 
N = /iiAi + fi2^2 + ••• + AtfcAfc and N = jliXi + ^2^2 + ••• + Afe^fci respectively. We define their 
sum, denoted Y\ [m] © Y2 [rh] , by interlacing and ordering the rows of Y\ [m] and Y2 [fh] ■ The sum 
corresponds to the partition N + N = /iiAi + ^2^2 + ••• + fJ-k^k + fiiM + ^2^2 + ••• + fik^k- F° r 
example: 



In other words, 



2 


3 


4|6| 


3 


1 


5 


7 1 8 


5 


7 




1 


2 


10 




8 


9 




6 


9 






1 















X, 



2 


3 


4|8| * 


3 


4 


5 


7|8| : ~ 


3 


4 


5 


7 


8 


5 


7 




1 


2 


10 




2 


3 


4 


6 




8 


9 




6 


9 




1 


2 


10 




1 










5 


7 
















6 


9 
















8 


9 
















1 









3 


1 


5 


7 


8| 


2 


3 


4 


6 




1 


2 


10 




5 


7 






6 


9 






8 


9 






1 









2 


3 


4|6| ffi 


3 


1 


5 


7 1 8 


5 


7 




1 


2 


10 




8 


9 




6 


9 






1 















Using (|III.44|) and the definition of Xe where is a filled Young tableau we obtain the equation 
of motion: 



i€Rows(0) 



n 0i6 2 ...0k 



^ ^ X e k+1 \...\e n . 

\Y\=\m\ *eS m /W 



Y\m' T }®e i 



(111.45) 
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The rows of are indexed by i and the indices filling the row i are denoted by 6\ . . . 6 ni . The 
length of the row i of 6 is n^. Here, the summation by the tensor index m is understood. The 
hat above 0* means that the row i was removed from ©. The brace indicates that we have to sum 
over all possible subsets (0i#2---0fc) of the set (6\, 9 ni ). The order of operations in the index of X 
in (jlll.45|) is first concatenation | and then (j^. An example of an equation of type ()111.45j) for the 
Michaelis-Menten system analyzed in a preeceding section follows: 



X 



E 




S_ 





-2kiX 



E 


E\E\ 


S_ 





-2fciX 



E 




S 




S 





-2LX, 



-2&_ 2 X, 



E 


E\P\ 


S_ 





-2fc_2X 



-kiX. 



+k s X [ 



E 




E 




S 




\e\e\ 



-jfeiXf 



+2/c_iX. 



3 



3 



-75X, 



+fc-iX r 



E 


P\ 


E 




S 









E 


E\ 


C_ 





The terms on the right side of the above formula were obtained from (|111,45|) . For example, for 
i = \E\E\ , m = EP, \Y\ = 2, we have 

l E 



-2fc_ 2 X, =R§ P X 



E 


E\P\ 


S_ 




E 


E\ 


S 




P 





2k- 2 X 



e \ E pms 



E I E^[S] 
P 



C. Equation of Motion for Rational Transition Probabilities 

The Master Equation is now 

dP(q,t) f e (q- e ,t) n ,_ ^ f e (q,t) 



at 



e fe{q-e,t) 



fe(q,t 



■P(q,t) 



(111.46) 



where f(q,t) &ndf(q,t) are polynomial functions in the state variable q. 

Multiplying both sides of ()III.46|) with h(q, t) = Y\ e f e (q, t)f t (q — e, t) produces a Master Equation 
with polynomial coefficients: 



h(q,t 



dP(q,t) 
dt 



]T TP (q- e ,t)P(q-e,t)-J2 (?, t)P(q, t) . 



The decomposition in the factorial base will be 



(111.47) 



h(q,t) 

T?(q,t) 



Y,M m {t)e m {q) 

m 

Y,M^{t)e mi {q) 

mi 

Y,M^{t)e m2 {q) 



(111.48) 
(111.49) 
(111.50) 



(ll 2 
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The boundary condition m > — e that applies to f e (q,t) also applies to T^(q,t) and T e 2 (g,i) and 
thus the equation for F(z, t) is 

^2 MT(t)z m d m d t F = £ M^ 1 (t)z €+m 'd mi F - M^(t)z m *d m2 F (111.51) 

EE E 

Change the variable to F(z,t) = e x<yZ,t \ 



MT{t)z m d Y[m^]X(z, t) + X(z, t) dy^Xiz, t) = (111.52) 

E \Y,<T Y,CT J 

^M e m w +mi y, dY l[mi ^x(z,t)-^2Mrnt)z m2 53 d Y2[m2 , 2] x( Z ,t) 

e Yi,ai e Y2,a2 

Take the partial derivative of l|111.52j) with respect to the tensor index a\. . .a n using the general 
formula 

d ai ... an (f9h) = {d ai .. Mhl fda kl+1 .. Mha gd aka+1 .. Mn h} . (111.53) 

The braces indicates the summation for all triplets of disjoint sets (a\ . . . a^), («fci+i • • • «fc 2 ) 
and (afc 2 +i • • • a n) that form a partition of the tensor index a\ . . . a n . 
The time evolution equation for the factorial cumulants is then 

(111.54) 

|<3ai...a fc ("i)^Q fe+1 |...a n |Y[m CT ] + Qai...a kl ( m ) X a kl+l |...a fea \Y [m CT ] X a k2+1 1 ...a n \Y [m°] } = 

{M™ 1 (t)Q ai ... ak (e + m 1 )X ak+1 ^ anlYl[mi . 1] - M'^ (t)Q ai ... ak (m 2 )X ak+ll , Mnmm2 , 2] } a 



where summation over the dummy indices m, m\ and m<i is implied. This equation was used to 
solve the Hill feedback control ()II.18|) . Instead of the Carleman bilinearization, which is difficult to 



apply for this case, we can use the variational approach [4jj and (jll.211 111.22 



IV. DISCUSSION 

In this work, we have extended the analysis carried out in |3(| from linear to nonlinear stochastic 
networks. The genetic regulatory networks are stimulated by a set of signal generators . In an 
experimental setting, a specific set of molecules (mRNAs, proteins) are selected and their time 
variation is controlled by input signal generators. As a consequence, the number of the different 
molecules that comprise the genetic regulatory network will vary in time. The time variation of these 
molecular numbers is subject to a system of equations. We deduce this system of equations for a 
stochastic genetic regulatory network described by a state, a set of transitions and their transition 
probabilities. The nonlinear effects are due to the transition probabilities being polynomial or 
rational functions in the state components. The system being stochastic, the variables of interest 
are the mean and the correlations for the molecular species which comprise the genetic network. 
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The time dependance of these means and correlations are expressed in terms of a set of factorial 
cumulants. The network's dynamic is described by the time variation of these factorial cumulants. 
The time evolution equations take the form (jlll.44|) for polynomial transition probabilities and 
()III.54|) for rational transition probabilities. 

We solved the equation of motion for four genetic regulatory networks. 

The first example aims to further generalize the results of |30(. There, a linear stochastic genetic 
network was analyzed and the equations for the factorial cumulants up to order two were solved. In 
the present paper, we study a nonlinear connection of two linear systems. We arrived at the con- 
clusion that the solution to the nonlinear coupled systems implies factorial cumulants of order more 
than two. The equations for the cumulants of a linear network can be organized in an hierarchical 
structure with respect to the order of the cumulants, see Fig. 3 System 1. The cumulants up to a 
given order form a closed system of equations, which is not the case for a typical nonlinear network. 
However, we also have shown that for the special case of linear systems, that nonlinear coupling 
does not require an infinite system of equations. Indeed, if we need to solve for the second order 
factorial cumulants for System 2, then we need up to forth order cumulants for System 1. 

The second example uses the equations (|III,54|) to study an autoregulatory system that is fre- 
quently used to explain experimental results. The system is composed of one gene which regulates 
its own transcription. The protein acts on mRNA production through a term that is a rational func- 
tion in protein number, see Table 2. We studied this system from a synthetic biology perspective, 
aiming to design a logic gate. The biomolecular device being intrinsically probabilistic, a logic 1 
will be characterized by a mean value and a standard deviation from the mean; similar for the logic 
0. The distance between the mean values of the logical levels should be sufficiently large to include 
the standard deviations of both logic levels. We presented a scheme to design a logic gate from an 
autoregulatory gene, see Fig. 7. From another point of view, the autoregulatory system is useful for 
checking the effectiveness of the factorial cumulants. Namely, the analytical solutions must match 
the results from a Monte Carlo simulation. Such a comparison was done for the case when the 
signal generator is closed, causing the gene transcription to be completely under the control of its 
protein product. Because inside a living cell many regulatory proteins appear in small number, we 
choose the network parameters so that the mRNA number fluctuates around 12 molecules, Table 
3. We found that the traditional mass action equations (|II.25|) do not explain the Monte Carlo 
simulations; to explain the simulated results we had to use equations that involve higher order fac- 
torial cumulants. Next, we study the response of the system to a time variable signal generator. 
The input-output relations for protein are presented in terms of the Laplace transforms of the time 
dependent variables ()II.30|) . 

The stochastic version of the classical Michaelis-Menten mechanism for enzymic catalysis is the 
subject of the third example. An input signal generator acting on the enzyme will drive the source, 
complex and product. The time variation of these molecules is described by the system of equations 
1)11. 31|) . For an oscillatory signal generator, the Michaelis-Menten process behaves as a molecular 
amplifier. Namely, it is possible to drive large oscillations in the product P using small oscillations in 
the enzyme E, Fig. 12. Another aspect that we investigated for this process is its transitory regime, 
Fig. 13. We showed that the dynamical equations pi.31j) and the Monte Carlo simulations agree 
with each other. Thus, we can use the dynamical equations to produce the statistical variables, 
rather than generating many instances of the stochastic process. 

The last example is based on E2F1 regulatory element. The transcription is controlled by three 
transcription factors: E2F1, DPI and pRb. Here we studied the interference of three signal gener- 
ators. Each generator will modulate the mRNA production as is specified by the transfer functions 
from (|II.37|) . The E2F1 regulatory element, as was studied here, is a part of a complex system of 
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many transcription factors. If we see the E2F1 as a module in a complex system, than we have 
to interconnect many modules to obtain the whole complex system. How to decompose a complex 
system into its modules and how to interconnect many modules to obtain a complex system is left 
for a future study. The dynamical equations for all these examples of gene regulatory networks 
are special cases of a general system of equations that were obtained in the last two sections of 
the article. For transition probabilities that are polynomial in the state components, the system 
of equations is (jIII,44|) . This system of equations is polynomial in the X m variables. With the 
Carleman bilinearization method, the system is transformed into HIII.45|) . Filled Young tableaux 
that appear in (|III.45|) help to construct new variables from the products of the X m variables. For 
rational transition probabilities, the equation is pil.54|) and was used to solve the Hill feedback 
control (ITLlSll . 

The procedure outline above can be applied to many other genetic networks such as networks 
with multistable steady states |2l| . The simplest network from this class is bistable; it toggles 
between two stable steady states. Biological examples of bistable systems include the lambda 
lysis-lysogeny switch and the hysteretic lac repressor system. Another avenue of research is to 
understand the feedback theory in terms of factorial cumulants. Negative feedback stabilizes the 
system whereas positive feedback is responsible for oscillations and multistable states. The feedback 
design principles are important in developing biomolecular devices. The question is thus: How to 
translate the feedback design theory from the classical control theory into the language of nonlinear 
stochastic genetic networks? The control theory for nonlinear stochastic genetic networks should 
also contain studies about observability and reachibility. From another perspective, studies on the 
factorial cumulants discard will be important. What is the minimum cummulat order we need to 
retain to reach a predefined precision for the output variables? 

From an experimental point of view, practical implementation of signal generators, on both 
the mRNA and protein level, will boost research on cell signaling. Experimentally it was proven 
that a source of oscillations propagates into a genetic network. Namely, in [3] a group of mice 
were exposed for 2 weeks to an external source of 12 hours light followed by 12 hours of darkness. 
This input external oscillator entrained the internal clock of the cell. The output signals (mRNA 
expression levels) were measured by sacrificing a mouse from the entrained group every 4 hours for 
2 days. The mice were kept in compete darkness during the 2 days measurement period, so only the 
internal clock will affect the mRNA levels. The data, collected with an Affymetrix (Santa Clara, CA) 
platform, showed that form ~ 6000 expressed genes, ~ 500 oscillated with a 24-h period. The next 
experiment would be to implement the light switch from Fig. 2, and drive one of the core component 
of the clock mechanism directly from its promoter. Then use a microarray experiment to measure 
the mRNA levels and find the set of genes that follow the frequency of signal generator. Besides 
a microarray design, which screen large sets of genes at few time points, an experimental design 
based on a phototube, j^jJl, can record the expression of few genes but in real time. Thus detailed 
information about the time variation of specific genes can be recorded. Such detailed information 
is crucial for developing a proper mathematical description of gene interaction. Models developed 
in the field of system identification in conjunction with the approach presented in this article, 
will help to better interpretate the measured data. With an input signal generator that acts on a 
target gene or protein, we can also measure the speed of propagation of the signal through the gene 
network. The speed of propagation can be very fast; for example the G protein-coupled receptor 
switches in milliseconds, [5l| . To conclude, understanding how the behavior of living cells emerges 
from a genetic network, experimental designs should be correlated with mathematical theories. We 
hope that the methods presented in this article will help to create new experimental designs for 
systems biology. 
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V. MATERIALS AND METHODS 
A. Monte Carlo Simulations for Time Dependent Transition Probabilities 

_The time dependent Direct Gillespie algorithm was used to generate the stochastic simulations 

We present here the Gillespie algorithm using the notations introduced before. 
We denote by p t (r \ q,t)dr the probability that the system will jump in direction e in the time 
interval [t + T,t + T + dr] if it stayed in the state q in the time interval [t, t+r]. In terms of transition 
probabilities: 



V< 



(r | q,t)dr = ^ T ^ q ' t+T ' )dT ' T € {q,t + r)dr (V.l) 



where T £ (q,t + r)dr is the probability that the system will jump from the state q in direction e 
in the time interval [t + r, t + r + dr\ , regardless of the system's history before t + r. The other term 

of ivrni 

e-EJo^ifeW)*" (V.2) 

is the probability that the system will stay in the state q in the time interval (t, t + r). 

To obtain (|V.2|) divide the interval (t, t+r) in small pieces (t + kd,t + (k + l)5) for k = 0, N — 1, 
with N5 = t. Then the probability that the system will stay in the state q in the time interval 
(t, t + r) is the product of the probabilities that the system will stay in the state q in the intervals 
(t + kS, t + (k + 1)5). Because 5 is small we can use the definition of the transition probability to 
find that 1 — T v (q, t + kS)6 is the probability that the system will stay in the state q in the time 
interval (t + kS, t + (k + 1)5). The fact that 5 is small makes also possible to bring this probability 
into an exponential form 

The exponential form will help to transform the^product of the probabilities into a sum. Mul- 
tiplying ()V.3|) for k = 1, N — 1 we obtain e~^''^° T? '( <?,i+T ) rfr ( which appear in the right side of 

(EU). 

The cumulative distribution function of p t (r \ q,t)dr is 



F(r\q,t)= ry2 P e(r"\q,t)dr" 
Jo e 

JO , 



= 1 - e -foY.^(q,t+T>)dTi _ 

After the transition took place at time t, the next transition will take place at t + r, with r a 
solution of the equation F(r \ q, t) = U±. Here U\ is a uniform random number from [0, 1] and q and 
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t are known. We will find r using the bisection method |52j. The root is bracketed in the interval 
[-ln(l - CJi)M(g)- 1 , -ln(l - ^i)m(g)- 1 ], where 

m(q) = inf xeR ^ T e (q, x) , (V.4) 



M(q) = sup xe n J2 T ^ x h (V.5) 

c 

and the procedure is stopped when an accuracy of 10~ a is reached. We used a = 5. 

After r was found, a second random number XJ<i is necessary to find which transition will take 
place. In other words, we have to find one e M from all possible transitions [i = 1 . . . T. The unknown 
H from the set of indices 1, T is obtained from 

M-i T fJ. 

J2Tz k (q,t + T)<U 2 Y,Te k (q,t + T)<Y,T €k (q,t + T). 

k=l k=l k=l 

Analytical computations and numerical analysis were done with Maple (Waterloo Maple Inc., 
Waterloo, Ontario, Canada) and Matlab (Mathworks, Natick, Massachusetts, United States). 
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